Suppose $U$ is a finite population of $N$ distinct units. Let $\mathcal{S}$ be the set of all subsets based on the elements of $U$. A sampling design, $d$, based on $U$ is a pair $(S_d, P_d)$, where $S_d$ is a subset of $\mathcal{S}$ and $P_d = \{P_d(s),\ s \in S_d\}$ is a probability distribution on $S_d$. To guarantee the estimability of the basic parameters of $U$, such as the population total, we insist that the union of the subsets in $S_d$ be $U$ and $P_d(s) > 0$ for each $s$ in $S_d$.

The first order inclusion probability of the unit $i$ under $d$ is defined to be

$$\pi_d(i) = \sum_{s \ni i} P_d(s),$$

and the second order (joint) inclusion probability of the units $i$ and $j$ ($i \neq j$) under $d$ is

$$\pi_d(i,j) = \sum_{s \ni i,j} P_d(s).$$


1. Invited paper presented at the International Conference on Optimization in Statistics held at IIT, Bombay, India, during December 20-22, 1977.

2. Research supported by Grant AFOSR 76-3050A.




Since the introduction of unequal probability sampling by Horvitz and Thompson (1952), the emphasis in the theory has been towards working with the above inclusion probabilities. This paper is mainly concerned with problems related to these inclusion probabilities.

Two sampling designs $d_1$ and $d_2$ are said to be equivalent with respect to these inclusion probabilities if

$$\pi_{d_1}(i) = \pi_{d_2}(i), \quad \pi_{d_1}(i,j) = \pi_{d_2}(i,j), \quad \forall\, i, j.$$

This paper studies the extent to which these inclusion probabilities characterize the sampling designs. This study has led us to sampling designs which have applications in controlled sampling. We have studied the following problem, among others. The classical simple random sample of size $n$ based on $U$, denoted SRS(N,n), is a sampling design whose support, $S_d$, consists of all $\binom{N}{n}$ possible samples of size $n$ and whose probability distribution, $P_d$, is uniform on the support. Thus a problem of interest is to find sampling designs equivalent to SRS(N,n) but whose support sizes may be less than $\binom{N}{n}$ and for which the probability distribution on their supports may or may not be uniform. It is shown that this can always be done. Such sampling designs have applications to controlled sampling.




1. Preliminaries

Let $U = \{u_1, u_2, \ldots, u_N\}$ be a population of $N$ identifiable units. Let $\mathcal{S}$ be the power set of $U$, i.e., the set of all subsets based on the elements of $U$. Note that the cardinality (size) of $\mathcal{S}$ is $2^N$. Hereafter we shall refer to the units in $U$ by their indices. Thus the unit $u_i$ will be referred to by "$i$".

Definition 1.1. A sampling design, $d$, based on $U$ is a pair $(S_d, P_d)$, where $S_d$ is a subset of $\mathcal{S}$ and $P_d = \{P_d(s),\ s \in S_d\}$ is a probability distribution on $S_d$.

To guarantee the estimability of the basic parameters (1) of $U$ we insist that the union of the subsets in $S_d$ be $U$ and $P_d(s) > 0$ for each $s$ in $S_d$.

Throughout the paper the cardinality of a set $Z$ will be denoted by $c(Z)$.

Definition 1.2. $S_d$ is called the support of $d$ and $c(S_d)$ is called the support size of $d$.

Definition 1.3. A sampling design is said to be a uniform sampling design if $P_d$ is uniform on $S_d$.


(1) Such as the population total $Y = \sum_{i=1}^{N} Y_i$, where $Y_i$ is the value of a real-valued function on the unit $i$.
Definition 1.4. We say a sampling design $d$ is of size $n$ if $c(s) = n$ for all $s$ in $S_d$.

In the sequel we shall refer to the elements of $S_d$ as samples and a sample in $S_d$ selected by implementing $P_d$ as a probability sample. Perhaps the most widely adopted sampling design in practice is the one known as a simple random sample design of size $n$, which can be defined under our notation as follows.

Definition 1.5. A sampling design $(S_d, P_d)$ based on $U$ of size $N$ is said to be a simple random sample design of size $n$, SRS(N,n), if

(i) $S_d$ consists of all subsets of size $n$ based on $U$,

(ii) $P_d$ is uniform on $S_d$, i.e., $P_d(s) = 1/\binom{N}{n}$.

In this paper we shall deal with sampling designs whose first order and second order inclusion probabilities are identical to the corresponding probabilities of simple random sample designs but whose support sizes may be less than $\binom{N}{n}$ and for which $P_d$ may or may not be uniform.

Such sampling designs have applications in the area of controlled sampling. The first order and second order inclusion probabilities are studied in Section 2.


2. Inclusion probabilities.

Since the introduction of unequal probability sampling by Horvitz and Thompson (1952) the emphasis in the theory has been towards working with the first and second order inclusion probabilities associated with sampling designs. These probabilities are defined as follows.

The first order inclusion probability associated with the unit $i$ in $U$ under the sampling design $d = (S_d, P_d)$ is

$$\pi_d(i) = \sum_{s \ni i} P_d(s). \tag{1}$$

This is the probability of selecting the unit $i$ if we implement the sampling design $d$.

The second order (joint) inclusion probability associated with the units $i$ and $j$ ($i \neq j$) in $U$ under $d$ is

$$\pi_d(i,j) = \sum_{s \ni i,j} P_d(s). \tag{2}$$

This is the probability of simultaneously selecting the units $i$ and $j$ if we implement the sampling design $d$.
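Definitions (1) and (2) translate directly into a short computation. The following is a minimal Python sketch, not part of the original paper: the function name `inclusion_probabilities` and the representation of a design as a dictionary from samples to probabilities are illustrative assumptions. It is applied to SRS(4,2), for which every unit has inclusion probability 1/2 and every pair 1/6.

```python
from fractions import Fraction
from itertools import combinations


def inclusion_probabilities(design, N):
    """Return first and second order inclusion probabilities.

    `design` maps each sample (a frozenset of unit indices 1..N) to its
    probability P_d(s); this representation is an assumption of the sketch.
    """
    first = {i: Fraction(0) for i in range(1, N + 1)}
    second = {pair: Fraction(0) for pair in combinations(range(1, N + 1), 2)}
    for sample, prob in design.items():
        for i in sample:                      # equation (1): sum over s containing i
            first[i] += prob
        for pair in combinations(sorted(sample), 2):
            second[pair] += prob              # equation (2): sum over s containing i and j
    return first, second


# SRS(4, 2): all 6 samples of size 2, each with probability 1/6.
srs = {frozenset(s): Fraction(1, 6) for s in combinations(range(1, 5), 2)}
first, second = inclusion_probabilities(srs, 4)
```

Exact rational arithmetic (`Fraction`) is used so that equality of inclusion probabilities can be tested without rounding error.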

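Some known linear constraints on the inclusion probabilities $\pi_d(i)$'s and $\pi_d(i,j)$'s are the following.

Proposition 2.1. Under the sampling design $d$

$$\sum_{i=1}^{N} \pi_d(i) = \sum_{s} c(s) P_d(s), \tag{3}$$

$$\sum_{i (\neq j) = 1}^{N} \pi_d(i,j) = \sum_{s \ni j} [c(s) - 1] P_d(s), \tag{4}$$

$$\sum_{i=1}^{N} \sum_{j (\neq i) = 1}^{N} \pi_d(i,j) = \sum_{s} c(s)[c(s) - 1] P_d(s). \tag{5}$$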


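Corollary 2.1. If $d$ is a sampling design of size $n$ then

$$\sum_{i=1}^{N} \pi_d(i) = n, \quad \sum_{i (\neq j) = 1}^{N} \pi_d(i,j) = (n-1)\pi_d(j), \quad j = 1, 2, \ldots, N,$$

$$\sum_{i=1}^{N} \sum_{j (\neq i) = 1}^{N} \pi_d(i,j) = n(n-1). \tag{6}$$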


Thus there are $N + 1$ distinct linear constraints on $\pi_d(i)$'s and $\pi_d(i,j)$'s. If the samples in $S_d$ are not identical in size, then the expected sample size under $d$ is

$$\text{expected sample size} = \sum_{s} c(s) P_d(s),$$

which is precisely $\sum_{i=1}^{N} \pi_d(i)$, and thus it should not be surprising that when $d$ is a sampling design of size $n$ then

$$\sum_{i=1}^{N} \pi_d(i) = n$$

whether or not $d$ is uniform on its support.
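The identities of Proposition 2.1 can be checked numerically on a design whose samples differ in size. The sketch below is illustrative only: the three-sample design on four units is a made-up example, not one taken from the paper, and the helper names `pi1` and `pi2` are assumptions.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical design on U = {1, 2, 3, 4} with samples of sizes 2, 3 and 3.
design = {
    frozenset({1, 2}): Fraction(1, 2),
    frozenset({1, 3, 4}): Fraction(1, 4),
    frozenset({2, 3, 4}): Fraction(1, 4),
}
units = range(1, 5)


def pi1(i):
    """First order inclusion probability of unit i."""
    return sum(p for s, p in design.items() if i in s)


def pi2(i, j):
    """Second order inclusion probability of units i and j."""
    return sum(p for s, p in design.items() if i in s and j in s)


# Identity (3): sum of pi(i) equals the expected sample size.
lhs3 = sum(pi1(i) for i in units)
expected_size = sum(len(s) * p for s, p in design.items())

# Identity (4), one instance per unit j.
lhs4 = {j: sum(pi2(i, j) for i in units if i != j) for j in units}
rhs4 = {j: sum((len(s) - 1) * p for s, p in design.items() if j in s) for j in units}

# Identity (5): sum over ordered pairs (i, j), i != j.
lhs5 = sum(pi2(i, j) for i, j in permutations(units, 2))
rhs5 = sum(len(s) * (len(s) - 1) * p for s, p in design.items())
```

For this design the expected sample size is 5/2, which is also the sum of the four first order inclusion probabilities.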


3. The problems and background.

Because of the importance of $\pi_d(i)$'s and $\pi_d(i,j)$'s in the theory of sampling it is interesting to investigate the extent to which these inclusion probabilities characterize the sampling designs. For example, $\pi_d(i) = n/N$ and $\pi_d(i,j) = n(n-1)/N(N-1)$ if the sampling design is SRS(N,n). Is it then true that SRS(N,n) is the only design with $\pi_d(i) = n/N$ and $\pi_d(i,j) = n(n-1)/N(N-1)$? The answer is no. Indeed, there exist such sampling designs which violate one or both conditions of SRS(N,n) specified in Definition 1.5. To formalize our problems we need the following definition.

Definition 3.1. Two sampling designs $d_1$ and $d_2$ based on $U$ are said to be equivalent with respect to the first order and second order inclusion probabilities if

$$\pi_{d_1}(i) = \pi_{d_2}(i) \quad \text{and} \quad \pi_{d_1}(i,j) = \pi_{d_2}(i,j) \quad \text{for all } i, j. \tag{7}$$

Hereafter, for simplicity, we shall say two sampling designs $d_1$ and $d_2$ are equivalent (designated by $d_1 \approx d_2$) if they are equivalent in the sense of Definition 3.1. Note that the condition $\pi_{d_1}(i) = \pi_{d_2}(i)$ implies that in order that $d_1 \approx d_2$ it is necessary that the expected sample size under $d_1$ be equal to the expected sample size under $d_2$, a natural demand for the concept to be practically meaningful.



Problem 1. Given a sampling design $d_1$, what is the minimum support size of a sampling design $d_2$ equivalent to $d_1$?

Problem 2. Suppose we are given a sampling design $d_1$ and a sampling design $d_2$ whose support size is minimum and which is equivalent to $d_1$. Let $M_{d_1}$ and $M_{d_2}$ be the support sizes of $d_1$ and $d_2$ respectively. Then for what values of $M$, $M_{d_2} < M < M_{d_1}$, is there a sampling design with support size $M$ which is equivalent to $d_1$?

These problems have not been fully solved as of today. Our experience indicates that they will remain unsolved for many years to come. Solutions to some aspects of these problems have been obtained by Chakrabarti (1963), Wynn (1977), Foody and Hedayat (1977), Hedayat and Li (1977) and Hedayat and Rao (1978).

Chakrabarti (1963) noticed that one can relate a balanced incomplete block (BIB) design based on $N$ treatments in $b$ blocks of size $n$ to a sampling design by considering the blocks as samples and treatments as units and letting $P_d(s) = 1/b$ where $s$ is a block of the design. Thus he proved the following.

Theorem 3.1. A uniform sampling design of size $n$ based on a population of size $N$ is equivalent to SRS(N,n) if and only if it is associated with a BIB design with no repeated blocks on $N$ treatments in blocks of size $n$.

Chakrabarti (1963) did not give any practical applications of sampling designs with support size less than $\binom{N}{n}$ and equivalent to SRS(N,n). Perhaps due to this lack of practical motivation for the problem solved by Chakrabarti, no further work on this subject came to print for a decade.

Avadhani and Sukhatme (1973) discussed sampling designs associated with BIB designs and gave some meaningful practical applications of such designs in controlled sampling.

For actual examples of controlled sampling see, for example, Goodman and Kish (1950) and Avadhani and Sukhatme (1973).

A more systematic study of Problem 1 was done by Wynn (1977), who used Carathéodory's theorem [see Rockafellar (1970), p. 151] and, for example, proved the following.

Theorem 3.2. If $d_1$ is a sampling design of size $n$ based on a population of size $N$ then there is a sampling design $d_2 \approx d_1$ with support size no greater than $N(N-1)/2$.

For example, if $d_1$ is SRS(8,3) then there is a sampling design $d_2 \approx d_1$ whose support size is no greater than $8(7)/2 = 28$. For $N = 8$ and $n = 3$ the upper bound 28 is not sharp. Wynn (1977) gave an example of a sampling design with support size 24 equivalent to SRS(8,3).
Foody and Hedayat (1977) formalized the concept of a sampling design of size $n$ based on a population of size $N$ equivalent to SRS(N,n) in the language of matrix algebra and mathematical programming and obtained several results in the terminology of BIB designs with repeated blocks. To point out some of their results and present further work in the area we need some notation and definitions, which are given in Section 4. In the rest of the paper we shall limit our study to sampling designs of size $n$.


4. Sampling designs in the language of matrix algebra and mathematical programming.

A 2-element subset of $U$ of size $N$ will be called a pair and an $n$-element subset will be called a sample of size $n$. Let $P$ denote the incidence matrix (do not confuse with $P_d$) of pairs versus blocks, the blocks being the samples of size $n$. So $P$ is a $\binom{N}{2}$ by $\binom{N}{n}$ zero-one matrix. Order the $\binom{N}{n}$ samples of size $n$ in some fashion and let $D$ be a multiset (a set which allows the elements to appear with multiplicity) based on the $\binom{N}{n}$ samples of size $n$. We write $f_i$ for the frequency of the $i$th sample of size $n$ in $D$. Let $F_D = (f_1, f_2, \ldots, f_{\binom{N}{n}})'$. Foody and Hedayat (1977) proved the following.
Theorem 4.1. A frequency vector $F_D$ determines a sampling design equivalent to SRS(N,n) if and only if

$$P F_D = \lambda \mathbf{1} \tag{8}$$

where $\lambda$ is a positive integer and $\mathbf{1}$ is a column vector of all ones.


Proof. The sampling design, $d$, associated with $F_D$ can be constructed as follows: Let $S_d$ consist of those $n$-element subsets of $U$ whose corresponding $f$'s in $F_D$ are not zero. The probability associated with a sample in $S_d$ will be the corresponding $f$ divided by $\sum_k f_k$. Now by (8)

$$\pi_d(i) = \frac{n \sum_k f_k / N}{\sum_k f_k} = \frac{n}{N}$$

and

$$\pi_d(i,j) = \frac{\lambda}{\sum_k f_k} = \frac{n(n-1)}{N(N-1)},$$

which shows that $d \approx$ SRS(N,n). The necessity part of the theorem can be similarly proved.

At this point we would like to point out that the sampling design $d$ associated with $F_D$ will have support size less than $\binom{N}{n}$ if one or more components of $F_D$ are zero, and $d$ will be nonuniform if there exist $i \neq j$ such that $f_i$ and $f_j$ are nonzero and $f_i \neq f_j$.
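Condition (8) is easy to test by machine for small $N$ and $n$. The following Python sketch builds the pairs-versus-samples incidence matrix $P$ and checks whether a frequency vector satisfies $P F_D = \lambda \mathbf{1}$. The function names and the lexicographic ordering of pairs and samples are assumptions of this sketch; the paper only requires that some fixed ordering be used. The second frequency vector uses the seven samples 124, 235, 346, 457, 156, 267, 137, which form a BIB design on 7 treatments with $\lambda = 1$.

```python
from itertools import combinations


def pair_sample_incidence(N, n):
    """Return (pairs, samples, P) with P[r][c] = 1 if pair r lies in sample c."""
    pairs = list(combinations(range(1, N + 1), 2))
    samples = list(combinations(range(1, N + 1), n))
    P = [[1 if set(pair) <= set(s) else 0 for s in samples] for pair in pairs]
    return pairs, samples, P


def pair_counts(P, F):
    """Compute the matrix-vector product P F as a list."""
    return [sum(p * f for p, f in zip(row, F)) for row in P]


def is_equivalent_to_srs(P, F):
    """Check condition (8): F >= 0 and P F = lambda * 1 with lambda > 0."""
    counts = pair_counts(P, F)
    return min(F) >= 0 and counts[0] > 0 and len(set(counts)) == 1


pairs, samples, P = pair_sample_incidence(7, 3)

# SRS(7, 3): every sample has frequency 1.
F_srs = [1] * len(samples)
lam_srs = pair_counts(P, F_srs)[0]

# A frequency vector supported on 7 samples only.
blocks = {(1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7), (1, 5, 6), (2, 6, 7), (1, 3, 7)}
F_bib = [1 if s in blocks else 0 for s in samples]
lam_bib = pair_counts(P, F_bib)[0]
```

For SRS(7,3) the value of $\lambda$ is the number of samples of size 3 containing a given pair, namely 5.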
Example 4.1. Let $N = 8$ and $n = 3$. Then SRS(8,3) has support size $\binom{8}{3} = 56$. The probability of each sample is 1/56. Based on Theorem 4.1 we exhibit below a sampling design equivalent to SRS(8,3) which is nonuniform and has support size 22.

Sample   Probability     Sample   Probability

125      1/56            347      2/56
137      1/56            128      3/56
146      1/56            178      3/56
245      1/56            268      3/56
246      1/56            468      3/56
367      1/56            478      3/56
467      1/56            234      4/56
127      2/56            567      4/56
237      2/56            136      5/56
256      2/56            145      5/56
257      2/56            358      6/56

Clearly the above is a sampling design and the reader can check that for this design $\pi_d(i) = 3/8$ and $\pi_d(i,j) = 6/56$, as in the case of SRS(8,3).

Table 1 in Foody and Hedayat (1977) provides BIB designs which can be converted to nonuniform sampling designs with all possible support sizes 22 to 55 when $N = 8$ and $n = 3$. In this case SRS(8,3) is the only uniform sampling design.
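The check left to the reader in Example 4.1 can be carried out mechanically. The Python sketch below transcribes the 22 samples and their numerators over 56 from the table of Example 4.1 and computes all first and second order inclusion probabilities; the variable names are illustrative and not from the paper.

```python
from fractions import Fraction
from itertools import combinations

# Samples of Example 4.1 with their frequencies (probability = frequency / 56).
freq = {
    "125": 1, "137": 1, "146": 1, "245": 1, "246": 1, "367": 1, "467": 1,
    "127": 2, "237": 2, "256": 2, "257": 2, "347": 2,
    "128": 3, "178": 3, "268": 3, "468": 3, "478": 3,
    "234": 4, "567": 4, "136": 5, "145": 5, "358": 6,
}
total = sum(freq.values())

# Convert each label such as "125" into the sample {1, 2, 5}.
design = {frozenset(int(ch) for ch in label): Fraction(f, total)
          for label, f in freq.items()}

units = range(1, 9)
pi1 = {i: sum(p for s, p in design.items() if i in s) for i in units}
pi2 = {(i, j): sum(p for s, p in design.items() if i in s and j in s)
       for i, j in combinations(units, 2)}
```

Every unit appears with total frequency 21 and every pair with total frequency 6, which gives $21/56 = 3/8$ and $6/56 = 3/28$ respectively.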
Theorem 4.1 says that each feasible solution of the system

$$P F_D = \lambda \mathbf{1}, \quad F_D \geq 0 \tag{9}$$

corresponds to a sampling design equivalent to SRS(N,n). The set of all rational feasible solutions to this system corresponds to all sampling designs equivalent to SRS(N,n). Now there is always at least one rational feasible solution to (9), namely the solution corresponding to SRS(N,n) with $f_i = 1$ and $\lambda = \binom{N-2}{n-2}$. Using the language of mathematical programming we know that all feasible solutions to (9) are convex combinations of the basic feasible solutions, so the search for all sampling designs equivalent to SRS(N,n) reduces to finding all basic feasible solutions to (9); that is, to finding all of the vertices of the polytope defined by (9).

In practice we are not, of course, interested in finding all solutions to (9). Rather, we seek a solution which excludes, or at least minimizes the selection probability of, certain samples. We may find such a sampling design by introducing an objective function which assigns positive cost to the samples which we wish to avoid and zero cost to the other samples. The standard linear programming algorithms for minimizing this objective function will then produce the desired design.




5. Bounds on the support size of a sampling design equivalent to SRS(N,n).

Let $d^*$ be a sampling design whose support size, $M_{d^*}$, is minimum among all sampling designs equivalent to SRS(N,n). Then we have

$$\left\lceil \frac{N}{n} \left\lceil \frac{N-1}{n-1} \right\rceil \right\rceil \leq M_{d^*} \leq \frac{N(N-1)}{2} \tag{10}$$

where $\lceil x \rceil$ denotes the smallest integer greater than or equal to $x$. Though the upper bound is already stated in Theorem 3.2, we can prove it easily by representing $d^*$ in its equivalent form

$$P F_{D^*} = \lambda^* \mathbf{1}. \tag{11}$$

Now recall that $P$ has $\binom{N}{2}$ rows and $\binom{N}{n}$ columns. Since $\binom{N}{2} = N(N-1)/2 \leq \binom{N}{n}$, the rank of $P$ is at most $N(N-1)/2$ [indeed it is precisely $N(N-1)/2$ by Lemma 5.1 of Foody and Hedayat (1977)]. Therefore $\lambda^* \mathbf{1}$ can be expressed as a linear combination of at most $N(N-1)/2$ columns of $P$, meaning that $F_{D^*}$ has at most $N(N-1)/2$ nonzero components. Thus $M_{d^*} \leq N(N-1)/2$.

To prove the lower bound, note that $\pi_{d^*}(i,j) > 0$. Thus to cover all pairs, each element of the population must appear in at least $\lceil (N-1)/(n-1) \rceil$ distinct samples in the support of $d^*$. Now let $m$ be the smallest number of samples of size $n$ needed to cover all $\binom{N}{2}$ pairs. Thus, the average number of distinct samples in which each element appears, $mn/N$, must be at least $\lceil (N-1)/(n-1) \rceil$. Thus $M_{d^*} \geq \lceil (N/n) \lceil (N-1)/(n-1) \rceil \rceil$.

There are infinitely many $N$'s and $n$'s for which sampling designs equivalent to SRS(N,n) exist and have support size $\lceil (N/n) \lceil (N-1)/(n-1) \rceil \rceil$. Therefore, the lower bound in (10) is sharp. As an example, let $N = 7$ and $n = 3$. Then we have:

Example 5.1. Below is a sampling design with the minimum support size which is equivalent to SRS(7,3).

Sample   Probability     Sample   Probability

124      1/7             561      1/7
235      1/7             672      1/7
346      1/7             713      1/7
457      1/7

Note that in this case $\lceil (N/n) \lceil (N-1)/(n-1) \rceil \rceil = 7$.

There are $N$'s and $n$'s for which the lower bound in (10) is much too small. For example, when $N = 8$ and $n = 3$ the lower bound in (10) becomes 11. But from Foody and Hedayat (1977) and Pesotchinsky (1977) we know that in this case the minimum support size is 22. In Example 4.1 we exhibited such a sampling design.
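The two bounds in (10) can be tabulated with a few lines of Python. This is a small illustrative sketch; the function names are assumptions, and integer arithmetic is used for the ceilings so that no floating-point rounding is involved.

```python
def ceil_div(a, b):
    """Ceiling of a / b for positive integers, using integer arithmetic."""
    return -(-a // b)


def support_size_bounds(N, n):
    """Return the (lower, upper) bounds of (10) on the minimum support size."""
    inner = ceil_div(N - 1, n - 1)        # ceil((N-1)/(n-1))
    lower = ceil_div(N * inner, n)        # ceil((N/n) * inner)
    upper = N * (N - 1) // 2              # N(N-1)/2
    return lower, upper


bounds_7_3 = support_size_bounds(7, 3)
bounds_8_3 = support_size_bounds(8, 3)
```

For $N = 7$, $n = 3$ the lower bound 7 is attained by Example 5.1, while for $N = 8$, $n = 3$ the interval from 11 to 28 contains the true minimum 22 quoted above.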
The problem of the characterization of all $N$'s and $n$'s for which the lower bound in (10) is achievable remains unsolved and it is a hard problem indeed. It is interesting to study the properties of $S_{d^*}$ and $P_{d^*}$ if the support size of $d^*$ is minimum, as given in (10). One thing which we can say is this: If such a sampling design exists and if $\lceil (N/n) \lceil (N-1)/(n-1) \rceil \rceil = N$ then $d^*$ must be a uniform sampling design and, moreover, $d^*$ exists if and only if there is a BIB design with $N$ blocks of size $n$ based on $N$ treatments.

Note that $\lceil (N/n) \lceil (N-1)/(n-1) \rceil \rceil \geq N$ and thus there is no sampling design equivalent to SRS(N,n) and having support size less than $N$. Another question of interest is: Is there any sampling design equivalent to SRS(N,n) with support size $N + 1$? The following proposition answers this question.

Proposition 5.1. There is no sampling design, $d$, of size $n$ based on a population of size $N$ with properties:

(i) $d \approx$ SRS(N,n) and (ii) $c(S_d) = N + 1$.

Proof. Assume to the contrary and let $d$ be such a sampling design. It is not difficult to see that there is an integer $\delta$ such that $\delta P_d(s)$ is an integer for all $s$ in the support of $d$. Then the samples in $S_d$ together with the $\delta P_d(s)$'s form a BIB design based on $N + 1$ distinct blocks. This contradicts Theorem 3.2 of van Lint and Ryser (1972) which says there is no BIB design with precisely $N + 1$ distinct blocks.

In regard to the problems listed in Section 3, our experience indicates that if $d_1$ is an SRS(N,n) and if there exists a sampling design, $d_2$, equivalent to SRS(N,n) with support size $N$ then there are integers between $N$ (the support size of $d_2$) and $\binom{N}{n}$ (the support size of SRS(N,n)) for which there are no sampling designs with such support sizes and equivalent to SRS(N,n). The result in Proposition 5.1 gives one such integer for arbitrary $N$ and $n$. Let us consider the case of $N = 7$ and $n = 3$. In this case there are no sampling designs equivalent to SRS(7,3) with support sizes 8, 9, 10, 12. Whether or not there is a sampling design with support size 16 is unknown to this writer. However, if $M$ is an integer between 7 and 35, and $M \neq 8, 9, 10, 12, 16$, then there is a sampling design with support size $M$ and equivalent to SRS(7,3), according to Hedayat and Li (1977).

If the minimum support size is not $N$ then we know very little about the support sizes of the sampling designs equivalent to SRS(N,n). Whether or not the case of $N = 8$ and $n = 3$ indicates something is not clear to us. In this case the minimum possible support size is 22 and, as Foody and Hedayat (1977) have shown, for every integer $22 \leq M \leq 55$ there is a sampling design with support size $M$ and equivalent to SRS(8,3). Any such sampling design for $N = 8$ and $n = 3$ will be nonuniform.

In Section 6 we shall study the method of trade off, which is a very useful technique for finding sampling designs with support size smaller than $\binom{N}{n}$ and equivalent to SRS(N,n).


6. The method of trade off and its application in sampling.

The idea of trade off is as follows: For given $N$ and $n$ we shall write down SRS(N,n). Then in order to reduce the support of SRS(N,n) we shall try to find two sets of samples, $S_1$ and $S_2$, in the support of SRS(N,n) such that it is possible to remove $S_2$ from the support and assign the related probabilities to samples in $S_1$ in such a fashion that the resulting sampling design is equivalent to SRS(N,n). If this can be done, then we say $S_2$ has been traded off for $S_1$. But the theory which will be presented is much broader than this.

Recall the notation of Section 4 and let $T$ be a nonzero integer column vector of dimension $\binom{N}{n}$.
Definition 6.1. The vector $T$ is called a trade if

$$P T = 0, \tag{12}$$

and the sum of all positive entries in a trade is called its volume.

Ignore the entries of $T$ which are zero and let $t_1, t_2, \ldots, t_g$ denote the positive components and $t'_1, t'_2, \ldots, t'_h$ denote the negative components. Thus

$$\sum_i t_i + \sum_j t'_j = 0. \tag{13}$$

By definition of $P$ and the existence of $T$ we can immediately identify two sets of samples of size $n$,

$$S_1 = \{s_1, s_2, \ldots, s_g\}, \quad S_2 = \{s'_1, s'_2, \ldots, s'_h\}, \quad S_1 \cap S_2 = \emptyset,$$

such that if $(x,y)$ is a pair of elements in some sample of $S_1 \cup S_2$ then

$$\sum_{s_i \ni (x,y)} t_i + \sum_{s'_j \ni (x,y)} t'_j = 0. \tag{14}$$

Conversely, if we are given two sets of samples and two sets of integers of the form

sample   integer         sample   integer

s_1      t_1             s'_1     t'_1
...      ...             ...      ...
s_g      t_g             s'_h     t'_h

with properties (13) and (14) we can immediately write down a vector $T$ which is a trade. The following example elucidates the above argument.

Example 6.1. Let $N = 7$ and $n = 3$. Consider the following samples and related integers.

sample   integer         sample   integer

123       2              124      -1
145       1              125      -1
156       1              135      -1
246       1              136      -1
257       1              236      -1
356       1              237      -1
367       1              456      -1
                         567      -1

The reader can check that these two sets of samples and corresponding integers satisfy (13) and (14). As an example let $(x,y) = (1,2)$. We see that the sample 123 contains $(1,2)$ with $t = 2$. In the set of samples with negative integers there are two samples containing $(1,2)$, namely 124 and 125, with $t'_1 = -1$ and $t'_2 = -1$.
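Conditions (13) and (14) can be verified for Example 6.1 by summing the integers over every pair. In the Python sketch below, the sample labels are transcribed under an assumption: they are the reading of the example's table for which (13) and (14) hold, the two sets of samples are disjoint, and the sample 123 receives the integer 2 (as used in Example 6.2). The helper name `pair_sums` is illustrative.

```python
from itertools import combinations

# Assumed transcription of the trade of Example 6.1 (N = 7, n = 3).
positive = {"123": 2, "145": 1, "156": 1, "246": 1, "257": 1, "356": 1, "367": 1}
negative = {"124": -1, "125": -1, "135": -1, "136": -1,
            "236": -1, "237": -1, "456": -1, "567": -1}
trade = {**positive, **negative}


def pair_sums(weights):
    """For each pair, sum the integers of all listed samples containing it."""
    sums = {}
    for sample, t in weights.items():
        for pair in combinations(sample, 2):   # sample labels are in increasing order
            sums[pair] = sums.get(pair, 0) + t
    return sums


volume = sum(positive.values())                 # sum of the positive entries
total = sum(trade.values())                     # left-hand side of (13)
balanced = all(v == 0 for v in pair_sums(trade).values())   # condition (14)
```

Under this transcription the trade has volume 8, and every pair covered by some listed sample has net weight zero.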

To write the corresponding vector $T$ associated with Example 6.1, let $T$ be a column vector of size $\binom{7}{3} = 35$. Note that by definition of $P$ each component of $T$ is related to a specific sample. For those samples listed above enter their corresponding $t$'s or $t'$'s in the appropriate components of $T$ and enter zero for all other samples.


For given $N$ and $n$ let $F_D$ be a frequency vector as defined in Section 4. Then we have:

Theorem 6.1. If $T$ is a trade and if $F_D$ determines a sampling design equivalent to SRS(N,n) then $F_D + T$ determines a sampling design which is equivalent to SRS(N,n) provided that no entry of $F_D + T$ is negative.


Example 6.2. Let $N = 7$ and $n = 3$ and let $F_D$ be the column vector with all its entries equal to 1. Let $T$ be the trade exhibited in Example 6.1. Then $F_D + T$ provides us a sampling design which is equivalent to SRS(7,3). Note that the support size of the corresponding design is 27 and the design will be nonuniform. The corresponding sampling design can be easily obtained as follows. Delete from the support of SRS(7,3) [note that the sampling design associated with our choice of $F_D$ is precisely SRS(7,3)] those samples whose related integers are $-1$. Since there are 8 such samples we will be left with $35 - 8 = 27$ samples. These 27 samples will be the support of the sampling design associated with $F_D + T$. The corresponding probabilities are calculated as follows: The probability associated with a sample in the new support will remain the same if that sample did not appear in the trade. Otherwise, the corresponding new probability is $1/35 + t/35$, where $t$ is its related integer in $T$. Thus in this case the probability associated with the sample (123) will be 3/35.
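The construction of Example 6.2 can be reproduced in a few lines. The Python sketch below adds the trade of Example 6.1 to the all-ones frequency vector of SRS(7,3), drops the samples whose frequency becomes zero, and checks the second order inclusion probabilities. The trade entries are an assumed transcription of Example 6.1 (the reading under which $PT = 0$ holds), and the variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# Assumed transcription of the trade T of Example 6.1.
trade = {
    (1, 2, 3): 2, (1, 4, 5): 1, (1, 5, 6): 1, (2, 4, 6): 1,
    (2, 5, 7): 1, (3, 5, 6): 1, (3, 6, 7): 1,
    (1, 2, 4): -1, (1, 2, 5): -1, (1, 3, 5): -1, (1, 3, 6): -1,
    (2, 3, 6): -1, (2, 3, 7): -1, (4, 5, 6): -1, (5, 6, 7): -1,
}

# Frequency vector of SRS(7, 3): every sample of size 3 has frequency 1.
F = {s: 1 for s in combinations(range(1, 8), 3)}

# F + T; Theorem 6.1 requires that no entry be negative.
new_F = {s: f + trade.get(s, 0) for s, f in F.items()}
no_negative_entry = min(new_F.values()) >= 0

# Support and probabilities of the resulting design.
support = {s: f for s, f in new_F.items() if f > 0}
total = sum(support.values())
design = {s: Fraction(f, total) for s, f in support.items()}

pi2 = {pair: sum(p for s, p in design.items() if set(pair) <= set(s))
       for pair in combinations(range(1, 8), 2)}
```

The resulting design has 27 samples, assigns probability 3/35 to the sample (123), and keeps every second order inclusion probability at $3 \cdot 2 / (7 \cdot 6) = 1/7$.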

Note that the trade in Example 6.1 can be used in finding sampling designs with support sizes smaller than $\binom{N}{3}$ and equivalent to SRS(N,3), as long as $N \geq 7$.

Hedayat and Li (1977) have studied the theory of trade off in the context of BIB designs with repeated blocks and have obtained several results directly applicable in sampling. For example, they have shown that:

Theorem 6.3. A trade of volume $i$ exists if and only if $i \neq 1, 2, 3$ or $5$.

Due to limitation of space, several other results on trade off will be reported elsewhere.